7 research outputs found

    OSIRIS-SR: a scalable yet reliable distributed workflow execution engine

    Workflows provide an easy-to-use programming model for the construction of complex services that are (recursively) composed of simpler services. When it comes to high-performance workflow execution, the distribution (outscaling) of the constituent services of the workflow across an environment of computational nodes is a key concept and also a very straightforward advantage of the workflow paradigm. However, scalable workflow execution cannot be provided by the distribution of services alone; it also necessitates novel architectures for the workflow engine in charge of service orchestration. Even though workflow orchestration is commonly provided by centralized solutions, these architectures imply performance bottlenecks and single points of failure. Hence, the workflow engine has to be distributed as well, by efficiently replicating workflow metadata across several nodes in a network. A particular challenge stems from the requirement of providing scalable workflow execution that is at the same time also reliable. In this paper, we present OSIRIS-SR, a decentralized middleware for the distributed execution of workflows. It has been designed specifically to jointly provide a high degree of scalability and reliability. Locally, OSIRIS-SR leverages the concurrent and redundant Actor model for workflow processing, whereas globally it runs a number of scalable system services for the management of workflow metadata, with the Safety Ring being the most prominent one. The Safety Ring service features a self-healing node overlay for the purpose of active workflow instance supervision that serves at the same time as a scalable and reliable metadata storage. We discuss in detail the Safety Ring architecture and the mechanics behind the scalable and reliable workflow management in OSIRIS-SR. The evaluation results of OSIRIS-SR show that support for reliable workflow execution does not significantly impact the system's scalability characteristics.

    Optimized P2P data management for reliable workflow execution in mobile environments

    Workflows allow complex applications to be built out of existing services. Hence, workflows are inherently distributed. At the same time, workflows are mostly executed by centralized workflow engines that invoke the services of a workflow in a request/reply style. These engines may become a performance bottleneck and limit the scalability of the entire system. As a consequence, approaches to distributed workflow execution have been proposed. While these approaches have better scalability characteristics, they require additional effort to deal with (partial) failures of the system. To cope with such failures, workflow instance data must be stored redundantly at different sites. This is even more important in mobile environments or mixed mobile/stationary environments where the nodes hosting services for workflows may be mobile and are thus more likely to fail permanently or become temporarily unavailable. This is the case, for instance, for sensor network applications where mobile devices capture data, or when smartphone apps share and jointly process data. In this paper, we present the combination of OSIRIS-SR, an extension to the distributed and decentralized workflow engine OSIRIS that focuses on reliable workflow execution by means of instance data replication, and Compass, an extension to Chord that is particularly tailored for efficient P2P data management on mobile devices.

    Shepherd: Node monitors for fault-tolerant distributed process execution in OSIRIS

    OSIRIS is a middleware for the composition and orchestration of distributed web services that follows a decentralized P2P approach to process execution, already providing some degree of resilience to faults and high performance in large-scale computational clusters. In this paper, we present ongoing work aimed at improving the fault tolerance capabilities of OSIRIS. We introduce new architectural elements in OSIRIS for the maintenance of a virtual stable storage and the monitoring of activities of service instances, together with algorithms that allow execution to survive failures that the system currently cannot cope with.

    Safety ring: fault-tolerant distributed process execution in OSIRIS

    The advent of service-oriented architectures (SOAs) has strongly facilitated the development and deployment of large-scale distributed (service-oriented) applications. The middleware for orchestrating process-based applications that consist of several distributed services has to be inherently distributed as well, in order to provide a high degree of scalability and to avoid a single point of failure. Self-healing execution of such processes supported by a distributed middleware requires replicated control metadata and instance data of processes. Most importantly, replication has to be provided in a way that does not affect the adaptivity and elasticity of the middleware for composite service execution. In this technical report, we introduce OSIRIS Safety Ring, a novel approach to fault-tolerant process execution. Safety Ring is based on OSIRIS, a distributed and decentralized middleware for the execution of composite services. Essentially, the Safety Ring exploits dedicated node monitors, organized in a self-organizing ring structure, for the replication of control data. Moreover, it leverages virtual stable storage for managing process instance data in a robust way. We present the architecture of OSIRIS’ Safety Ring and discuss in detail the algorithms it applies for self-healing process execution. The performance evaluation shows that the additional gain in robustness has only marginal effects on the scalability characteristics of the system.
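
The idea of monitors arranged in a ring that replicate control data and heal after a failure can be sketched roughly as follows. This is a minimal, generic successor-replication sketch and not the Safety Ring algorithm itself: the class `MonitorRing`, its methods, the integer keys, and the replication factor are all hypothetical names and parameters chosen for illustration.

```python
from bisect import bisect_left


class MonitorRing:
    """Toy ring of monitor nodes; each key is kept on `replicas` successive nodes.

    Hypothetical illustration only, not the OSIRIS Safety Ring protocol.
    """

    def __init__(self, node_ids, replicas=2):
        self.ring = sorted(node_ids)
        self.replicas = replicas
        self.store = {n: {} for n in self.ring}  # node id -> {key: state}

    def holders(self, key):
        """Nodes responsible for key: first node with id >= key, then its successors."""
        if not self.ring:
            return []
        i = bisect_left(self.ring, key) % len(self.ring)
        count = min(self.replicas, len(self.ring))
        return [self.ring[(i + j) % len(self.ring)] for j in range(count)]

    def put(self, key, state):
        """Write the state to every responsible node."""
        for n in self.holders(key):
            self.store[n][key] = state

    def get(self, key):
        """Read from the first responsible node that holds the key."""
        for n in self.holders(key):
            if key in self.store[n]:
                return self.store[n][key]
        return None

    def fail(self, node):
        """Drop a node (its data is lost) and restore the replication factor."""
        self.ring.remove(node)
        del self.store[node]
        self._heal()

    def _heal(self):
        # Collect whatever survived on any node and re-place it on the
        # current holders. Stale copies on former holders are left in place.
        surviving = {}
        for data in self.store.values():
            surviving.update(data)
        for key, state in surviving.items():
            self.put(key, state)


ring = MonitorRing([10, 20, 30, 40], replicas=2)
ring.put(15, "running")      # stored on nodes 20 and 30
ring.fail(20)                # node 20 is lost; the copy on 30 is re-replicated to 40
```

In this sketch, data survives as long as at least one of its holders outlives each failure; tolerating simultaneous failures would require a larger replication factor.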

    Optimized P2P Data Management for Reliable Workflow Execution in Mobile Environments

    Workflows allow complex applications to be built out of existing services. Hence, workflows are inherently distributed. At the same time, workflows are mostly executed by centralized workflow engines that invoke the services of a workflow in a request/reply style. These engines may become a performance bottleneck and limit the scalability of the entire system. As a consequence, approaches to distributed workflow execution have been proposed. While these approaches have better scalability characteristics, they require additional effort to deal with (partial) failures of the system. To cope with such failures, workflow instance data must be stored redundantly at different sites. This is even more important in mobile environments or mixed mobile/stationary environments where the nodes hosting services for workflows may be mobile and are thus more likely to fail permanently or become temporarily unavailable. This is the case, for instance, for sensor network applications where mobile devices capture data, or when smartphone apps share and jointly process data. In this paper, we present the combination of OSIRIS-SR, an extension to the distributed and decentralized workflow engine OSIRIS that focuses on reliable workflow execution by means of instance data replication, and Compass, an extension to Chord that is particularly tailored for efficient P2P data management on mobile devices.

    COMPASS – Optimized Routing for Efficient Data Access in Mobile Chord-based P2P Systems

    During the last decade, overlay networks based on distributed hash tables have become the de facto standard for data management in Peer-to-Peer (P2P) systems, with Chord being their most prominent representative. Essentially, with its fully decentralized approach, Chord avoids any bottleneck and single point of failure while guaranteeing data to be retrieved in O(log N) hops in a network consisting of N nodes. By optimizing the number of hops for data access, Chord implicitly assumes that all connections between nodes have comparable bandwidth and latency characteristics. However, in heterogeneous, mobile P2P systems that consist of both mobile and fixed nodes, this is not the case. Moreover, due to the mobility of nodes, connection parameters can change dynamically. Especially in mobile P2P applications where low latency for data access is essential, such as in emergency management, routing should aim at reducing the overall latency rather than the number of hops in the network. In this paper, we present COMPASS, a protocol for efficient data access in heterogeneous mobile Chord-based P2P systems. COMPASS takes into account that the network latency of nodes in a mobile P2P network may differ significantly and thus aims at minimizing the overall latency, even if this necessitates more hops in the network. This is done by probing the network and by maintaining at each peer, in addition to Chord’s finger table, a data structure called the COMPASS table. We present in detail the initialization and maintenance of the COMPASS table, which dynamically adapts to changing node characteristics. Evaluation results show that COMPASS outperforms standard Chord-based routing and reduces the overall latency in heterogeneous P2P networks consisting of fixed and mobile nodes.
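
To make the baseline concrete, the following is a minimal sketch of standard hop-minimizing Chord lookup using finger tables, together with a helper that sums per-hop latencies along the resulting path. It does not implement COMPASS or its table. The 6-bit identifier space, the node set, the `LATENCY_MS` values, and all function names are assumptions chosen for illustration.

```python
from bisect import bisect_left

M = 6                 # identifier bits (assumed for the demo)
RING = 2 ** M         # size of the identifier space
NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]   # sorted node ids (assumed)


def successor(nodes, ident):
    """First node at or clockwise after ident."""
    i = bisect_left(nodes, ident % RING)
    return nodes[i % len(nodes)]


def finger_table(nodes, n):
    """Chord fingers of node n: finger[i] = successor(n + 2**i)."""
    return [successor(nodes, n + 2 ** i) for i in range(M)]


def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b          # interval wraps around zero


def lookup(nodes, start, key):
    """Greedy Chord lookup; returns (responsible node, list of visited nodes)."""
    n, path = start, [start]
    while True:
        succ = successor(nodes, n + 1)
        if key == succ or in_open(key, n, succ):
            if succ != n:
                path.append(succ)
            return succ, path
        nxt = succ                 # fall back to the direct successor
        for f in reversed(finger_table(nodes, n)):
            if in_open(f, n, key): # farthest finger that still precedes the key
                nxt = f
                break
        n = nxt
        path.append(n)


def path_latency(path, latency_ms):
    """Sum of the (assumed) latency of contacting each node after the first."""
    return sum(latency_ms[n] for n in path[1:])


# Hypothetical latencies: node 42 stands in for a slow mobile peer.
LATENCY_MS = {n: 10 for n in NODES}
LATENCY_MS[42] = 200

owner, path = lookup(NODES, 8, 54)   # path is [8, 42, 51, 56]
total = path_latency(path, LATENCY_MS)
```

With these assumed latencies, the three-hop path costs 220 ms because greedy finger selection passes through the slow node 42. A latency-aware scheme could accept an extra hop through faster peers instead, which is the trade-off the abstract describes.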

    COMPASS – latency optimal routing in heterogeneous chord-based P2P systems

    During the last decade, overlay networks based on distributed hash tables have become the de facto standard for data management in Peer-to-Peer (P2P) systems, with Chord being their most prominent representative. Essentially, with its fully decentralized approach, Chord avoids any bottleneck and single point of failure while guaranteeing data to be retrieved in O(log N) hops in a network consisting of N nodes. By optimizing the number of hops for data access, Chord implicitly assumes that all connections between nodes have comparable bandwidth and latency characteristics. However, in heterogeneous, mobile P2P systems that consist of both mobile and fixed nodes, this is not the case. Moreover, due to the mobility of nodes, connection parameters can change dynamically. Especially in mobile P2P applications where low latency for data access is essential, such as in emergency management, routing should aim at reducing the overall latency rather than the number of hops in the network. This technical report is based on our paper published at the MDM 2013 conference and extends it with new features and evaluations. In this report, we present COMPASS, a protocol for efficient data access in heterogeneous mobile Chord-based P2P systems. COMPASS takes into account that the network latencies of nodes in a mobile P2P network may differ significantly and thus aims at minimizing the overall latency, even if this necessitates more hops in the network. This is done by probing the network and by maintaining at each peer, in addition to Chord’s finger table, a data structure called the COMPASS table. We present in detail the initialization and maintenance of the COMPASS table, which dynamically adapts to changing node characteristics. Moreover, this report presents for the first time the new interval joining feature, which reduces the COMPASS table size by joining similar intervals. Real-world evaluation as well as simulation results show that COMPASS outperforms standard Chord-based routing and reduces the overall latency in heterogeneous P2P networks consisting of fixed and mobile nodes.